{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Relationships in the GO\n",
    "_Alex Warwick Vesztrocy, March 2016_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For some analyses, it is possible to only use the <code>is_a</code> definitions given in the Gene Ontology. \n",
    "\n",
    "However, it is important to remember that this isn't always the case. As such, <code>GOATOOLS</code> includes the option to load the relationship definitions also."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading GO graph with the relationship tags\n",
    "This is possible by using the <code>optional_attrs</code> argument, upon instantiating a <code>GODag</code>. This can be done with either the full or the basic version of the ontology. Here, the __full__ version shall be used. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "100% [........................................................................] 34671316 / 34671316go.obo: format-version(1.2) data-version(releases/2016-04-27)"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "load obo file go.obo\n",
      "46518"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      " nodes imported\n"
     ]
    }
   ],
   "source": [
    "from goatools.obo_parser import GODag\n",
    "import wget\n",
    "\n",
    "go_fn = wget.download('http://geneontology.org/ontology/go.obo')\n",
    "go = GODag(go_fn, optional_attrs=['relationship'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Viewing relationships in the GO graph"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So now, when looking at an individual term (which has relationships defined in the GO) these are listed in a nested manner. As an example, look at <code>GO:1901990</code> which has a single <code>has_part</code> relationship, as well as a <code>regulates</code> one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "eg_term = go['GO:1901990']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GOTerm('GO:1901990'):\n",
       "  name:regulation of mitotic cell cycle phase transition\n",
       "  relationship: 2 items\n",
       "    regulates: 1 items\n",
       "      GO:0044772\tlevel-05\tdepth-05\tmitotic cell cycle phase transition [biological_process] \n",
       "    has_part: 1 items\n",
       "      GO:0051437\tlevel-07\tdepth-12\tpositive regulation of ubiquitin-protein ligase activity involved in regulation of mitotic cell cycle transition [biological_process] \n",
       "  level:6\n",
       "  is_obsolete:False\n",
       "  namespace:biological_process\n",
       "  id:GO:1901990\n",
       "  depth:7\n",
       "  parents: 2 items\n",
       "    GO:0007346\tlevel-05\tdepth-05\tregulation of mitotic cell cycle [biological_process] \n",
       "    GO:1901987\tlevel-06\tdepth-06\tregulation of cell cycle phase transition [biological_process] \n",
       "  children: 7 items\n",
       "    GO:1901991\tlevel-07\tdepth-08\tnegative regulation of mitotic cell cycle phase transition [biological_process] \n",
       "    GO:1901992\tlevel-07\tdepth-08\tpositive regulation of mitotic cell cycle phase transition [biological_process] \n",
       "    GO:2000045\tlevel-07\tdepth-08\tregulation of G1/S transition of mitotic cell cycle [biological_process] \n",
       "    GO:0010389\tlevel-07\tdepth-08\tregulation of G2/M transition of mitotic cell cycle [biological_process] \n",
       "    GO:0007096\tlevel-07\tdepth-08\tregulation of exit from mitosis [biological_process] \n",
       "    GO:0030071\tlevel-07\tdepth-10\tregulation of mitotic metaphase/anaphase transition [biological_process] \n",
       "    GO:0060564\tlevel-07\tdepth-13\tnegative regulation of APC-Cdc20 complex activity [biological_process] \n",
       "  _parents: 2 items\n",
       "    GO:0007346\n",
       "    GO:1901987\n",
       "  alt_ids: 0 items"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "eg_term"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These different relationship types are stored as a dictionary within the relationship attribute on a GO term."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['regulates', 'has_part']\n"
     ]
    }
   ],
   "source": [
    "print(eg_term.relationship.keys())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "set([GOTerm('GO:0044772'):\n",
      "  name:mitotic cell cycle phase transition\n",
      "  relationship: 1 items\n",
      "    part_of: 1 items\n",
      "      GO:0000278\tlevel-04\tdepth-04\tmitotic cell cycle [biological_process] \n",
      "  level:5\n",
      "  is_obsolete:False\n",
      "  namespace:biological_process\n",
      "  id:GO:0044772\n",
      "  depth:5\n",
      "  parents: 2 items\n",
      "    GO:0044770\tlevel-04\tdepth-04\tcell cycle phase transition [biological_process] \n",
      "    GO:1903047\tlevel-04\tdepth-04\tmitotic cell cycle process [biological_process] \n",
      "  children: 4 items\n",
      "    GO:0000086\tlevel-06\tdepth-06\tG2/M transition of mitotic cell cycle [biological_process] \n",
      "    GO:0000082\tlevel-06\tdepth-06\tG1/S transition of mitotic cell cycle [biological_process] \n",
      "    GO:0010458\tlevel-06\tdepth-06\texit from mitosis [biological_process] \n",
      "    GO:0007091\tlevel-06\tdepth-10\tmetaphase/anaphase transition of mitotic cell cycle [biological_process] \n",
      "  _parents: 2 items\n",
      "    GO:0044770\n",
      "    GO:1903047\n",
      "  alt_ids: 0 items])\n"
     ]
    }
   ],
   "source": [
    "print(eg_term.relationship['regulates'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example use case\n",
    "One example use case for the relationship terms, would be to look for all functions which regulate pseudohyphal growth (<code>GO:0007124</code>). That is:\n",
    "\n",
    "> A pattern of cell growth that occurs in conditions of nitrogen limitation and abundant fermentable carbon source. Cells become elongated, switch to a unipolar budding pattern, remain physically attached to each other, and invade the growth substrate. \n",
    ">\n",
    "> _Source: https://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0007124#term=info&info=1_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "term_of_interest = go['GO:0007124']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, find the relationship types which contain \"regulates\":"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "frozenset(['regulates', 'negatively_regulates', 'positively_regulates'])\n"
     ]
    }
   ],
   "source": [
    "regulates = frozenset([typedef \n",
    "                       for typedef in go.typedefs.keys() \n",
    "                       if 'regulates' in typedef])\n",
    "print(regulates)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, search through the terms in the tree for those with a relationship in this list and add them to a dictionary dependent on the type of regulation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from collections import defaultdict\n",
    "\n",
    "regulating_terms = defaultdict(list)\n",
    "\n",
    "for t in go.values():\n",
    "    if hasattr(t, 'relationship'):\n",
    "        for typedef in regulates.intersection(t.relationship.keys()):\n",
    "            if term_of_interest in t.relationship[typedef]:\n",
    "                regulating_terms['{:s}d_by'.format(typedef[:-1])].append(t)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now <code>regulating_terms</code> contains the GO terms which relate to regulating protein localisation to the nucleolus."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pseudohyphal growth (GO:0007124) is:\n",
      "\n",
      "  - negatively_regulated_by:\n",
      "    -- GO:2000221 negative regulation of pseudohyphal growth\n",
      "    -- GO:0100042 negative regulation of pseudohyphal growth by transcription from RNA polymerase II promoter\n",
      "    -- GO:1900462 negative regulation of pseudohyphal growth by negative regulation of transcription from RNA polymerase II promoter\n",
      "\n",
      "  - regulated_by:\n",
      "    -- GO:2000220 regulation of pseudohyphal growth\n",
      "\n",
      "  - positively_regulated_by:\n",
      "    -- GO:2000222 positive regulation of pseudohyphal growth\n",
      "    -- GO:0100041 positive regulation of pseudohyphal growth by transcription from RNA polymerase II promoter\n",
      "    -- GO:1900461 positive regulation of pseudohyphal growth by positive regulation of transcription from RNA polymerase II promoter\n"
     ]
    }
   ],
   "source": [
    "print('{:s} ({:s}) is:'.format(term_of_interest.name, term_of_interest.id))\n",
    "for (k, v) in regulating_terms.items():\n",
    "    print('\\n  - {:s}:'.format(k))\n",
    "    for t in v:\n",
    "        print('    -- {:s} {:s}'.format(t.id, t.name))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
